CD ROM Paradise Collection 4

home *** CD-ROM | disk | FTP | other *** search

/ CD ROM Paradise Collection 4 / CD ROM Paradise Collection 4 1995 Nov.iso / program / flogger.zip / README < prev next >

Wrap

Text File | 1992-12-30 | 7.8 KB | 173 lines

VERSION This is version 0 of the the Sort Flogger. OVERVIEW This is a small collection of some of the more popular sort algorithms, including: bubble sort -- any 1st semester course in CS heap sort -- contributed by der Mouse insertion sort -- any 1st semester course in CS merge sort -- roughly patterned after Knuth Vol. 3 quick sort -- C.A.R. Hoare's as given in K&R 2 pg 87 shell sort -- D.L. Shell as given in K&R 2 pg 62 bogo sort -- worst-case sort by Richard Krehbiel A slightly weird test routine is provided which calculates some performance measurements on the above algorithms and also on the qsort() function provided with your system, as well as verifying that the sort actually worked and that it didn't modify anything immediately above or below the array, that it doesn't leak memory, and whether or not it's stable. These tests are carried out in a variety of situations designed to highlight worst-case behavior. These implementations of the sort functions are all compatible with qsort()'s parameter list, so if you think your application spends too much time sorting, you can just s/qsort/merge_sort/ and link in the sort library and give it a try. If you think it doesn't spend *enough* time sorting, try bubble_sort instead. The merge sort declares a global int called _maybe_sorted which tells merge sort that the data in the array may already be mostly in order. It is advisory--the algorithm will produce the correct results regardless of _maybe_sorted's value; however the execution time for merge_sort will be quite a bit less if this is set and the array is already mostly sorted. If the array isn't sorted at all, this option extracts about 2% speed penalty. The heapsort was contributed by mouse@larry.mcrcim.mcgill.edu aka "der Mouse". The merge_sort() function is the only one I've taken any trouble to optimize. INSTALLATION See the INSTALLATION document which should be nearby. USAGE Just type "flogger" or "flogger <number>" and watch what happens. DESCRIPTION OF OUTPUT It seems like it doesn't really do anything but spit out numbers, and rather slowly at that. The version and patch level of the program is printed; the resolution of the clock is printed; the size of the elements is reported; and the number of elements to be sorted is reported. For each sort algorithm, a table of measurements is reported. The rows describe the situation the sort algorithm was presented with for those measurements and the columns represent the type of measurement taken. The rows (situations) are: random: The contents of the array to be sorted are pseudo random numbers generated by rand(), with two special values excluded for the purpose of detecting overwrites at the ends of the array. ascend: The array is already completely in order. To simply verify this fact is all that's required of the sort algorithm; ironically, bubble sort outperforms all the rest in this case. descend: The array is exactly in reverse order. Theoretically, an sort function could detect this and swap ends in linear time, but it hardly seems useful. fib asc: The contents of the array are irrelevant; the comparison function passed to the routine always says that its first argument is less than its second. fib desc: Similarly, the comparison function always says that its first argument is greater than its second. surprise: A pseudorandom result is returned from the comparison function in an attempt to confound the algorithms' efforts at sorting. mostly: Similar to ascend, but the first and last elements are reversed. This is meant to simulate an index which is pretty much already sorted and only a small number of new items need to be inserted. equiv: The comparison function reports that all elements compare equal. The data is arranged so that a check can be made for whether the algorithm is stable; that is the property that records with equal keys stay in the same relative order in the array that they were in before the sort began. The columns refer to these measurements: compares: The number of calls to the comparison routine during the sort. I consider this very important, as most of the sorting bottlenecks I've experience center around the comparison function. stack: (Estimated) Some semi-non-portable attempts to record the growth of the stack are made and a number comes out. I consider O(log n) use of the stack acceptable on modern machines and none of these algorithms are worse than that. heap: Amount of malloc'd storage as reported by mallinfo(). If your C library doesn't support mallinfo(), the compiler will probably barf on flogger.c. The flogger checks for memory leaks. user: Amount of user mode CPU time as reported by times(). This includes time for both the operation of the sort function and the comparison function; as these are only marginally related, this number is not useful for estimating how long it might take to sort useful data in some other program. system: Amount of system CPU time as reported by times(). Generally 0.00 or very small. If a sort algorithm uses lots of system time it's probably a Trojan horse trying to break into the passwd file or delete your home directory or something so maybe you should start shuffling through your desk looking for a recent backup tape. I may remove this in a future version (the report of system time, not the Trojan horse) in favor of another metric, perhaps wall clock time. But it isn't enough simply to be fast, a sort algorithm also has to be correct. To that end, the flogger routine verifies that the array was actually sorted, at least when the comparison routine wasn't lying. It also checks that magic values in the memory locations immediately above and below the array were not accidentally accessed or overwritten. The bogo sort run is skipped if n is sufficiently large that it's execution would run into trillions of cycles (approx n = 15). MISCELLANEOUS When running the test routine, some sorts take so long that you might lose patience. Keyboard interrupt (aka Control-C) will abort the current sort test and move on to the next. If you really want the program to end, give it a quit signal (Control-\) or send it a signal with the kill command. You can also background it (Control-Z) from the C-shell then kill it. The size of the elements being sorted is hard-coded at the moment, but it can be changed by going into flogger.c and changing the typedefs for #define TEST_TYPE double #define TEST_FUNC double_compare into #define TEST_TYPE int #define TEST_FUNC int_compare or back again. Or add a new type and a new comparison function. Several of the sort algorithms are not re-entrant which means you shouldn't call the sort from within the very same comparison function you passed into that sort in the first place. COPYRIGHT Since none of the sort algorithms are mine and anyone with a copy of Knuth Vol 3 and a little knowledge of C could type them in as easily as I did, there isn't much point in claiming a copyright, much less applying for a patent. I wanted to disclaim liability though, so I put a few minor restrictions on the code, as set forth in each of the source files.